FOREWORD


This  memorandum  was  presented  at  the  Military  Policy  Evaluation: 
Quantitative  Applications  workshop  conference  hosted  by  the  Strategic 
Studies Institute in mid-1977. During the workshop, sponsored by
DePaul University and the Strategic Studies Institute, academic and
government  experts  presented  the  latest  findings  of  formal  models  and 
statistical-mathematical  approaches  to  the  processes  of  military 
decisionmaking,  assistance,  intervention,  and  conflict  resolution. 

The  Military  Issues  Research  Memoranda  program  of  the  Strategic 
Studies  Institute,  US  Army  War  College,  provides  a forum  for  the 
timely  dissemination  of  analytical  papers  such  as  those  presented  at  the 
workshop. 

This  memorandum  is  being  published  as  a contribution  to  the  field 
of  national  security  research  and  study.  The  data  and  opinions 
presented  are  those  of  the  author  and  in  no  way  imply  the  indorsement 
of  the  College,  the  Department  of  the  Army  or  the  Department  of 
Defense. 


Major  General,  USA 
Commandant 

MEASURING INFORMATION CONTENT
IN LONG-RANGE FORECASTS


In  the  past  5 years  there  has  been  a dramatic  shift  in  the 
requirements  for  technological  forecasts  of  enemy  weapons  systems. 
For  example,  many  forecasts  now  require  the  estimation  of  annual 
traces  of  upper  and  lower  boundaries  within  which  the  real  values 
should fall a specified percentage of the time. Thus, were an analyst
forecasting the number of foxbats for the next 10 years, he might be
asked to present, for each of the next 10 years, an upper and lower
estimate such that there would be a 75 percent chance that the real
value would fall within the range. In addition, analysts are often asked
to present their best or most probable estimates. Furthermore, for any
given weapon, there may be a complete set of such forecasts including,
for example, not only the number of weapons forecast into the future
but also a number of the performance characteristics of the weapon.
Indeed,  in  some  instances,  even  textual  descriptions  such  as  anticipated 
mission  are  now  being  systematized  by  requiring  that  a probability  be 
associated  with  each  alternative. 

While the requirements for these types of forecasts differ somewhat
by agency, the short-term trends are apparent and in some cases
already reality. That is, wherever possible, analysts are being asked or
required to express their estimates numerically, with probability
estimates and best estimates.

One  of  the  associated  difficulties  (aside  from  the  difficulty  of 
actually  performing  the  task)  is  that  the  array  of  numbers  cannot  easily 
be  summarized  by  the  planners  who  are  the  recipients  of  the  forecasts. 
The  first  decision  concerning  any  forecast  which  a planner  must  make  is 
whether  it  is  anything  against  which  he  or  she  should  bother  to  develop 
plans.  While  there  are  many  criteria  which  the  planner  should  use  to 
make  such  a judgment,  one  of  the  primary  factors  should  be  the 
amount  of  information  which  is  contained  in  the  estimate.  Presumably, 
a forecast  which  contains  very  little  information  should  be  taken  less 
seriously  than  would  one  which  is  quite  informative.  Of  course,  if  a 
weapon  which  looks  potentially  quite  menacing  has  very  little 
information  describing  it,  a very  reasonable  action  would  be  to  improve 
the  information  content.  Then  if  it  continues  to  appear  as  menacing, 
plans  should  be  seriously  considered  to  account  for  this  addition  to  the 
enemy’s  arsenal.  However,  at  present,  such  decisions  are  complicated  by 
the  fact  that  there  is  no  mechanism  to  develop  an  aggregate  measure  of 
the  quantity  of  information  contained  in  a given  set  of  forecasts  of  the 
various  aspects  of  a given  weapon  or  weapon  system. 

This  paper  will  present  a methodology  for  measuring  the 
information  content  of  the  total  set  of  forecasts  across  all  categories  for 
any  given  weapon. 

In  general,  these  types  of  forecasts  are  concentrated  attempts  to 
provide  planners  with  the  best  information  available  as  to  the  size, 
characteristics,  and  distribution  of  enemy  weapon  systems  in  the 
medium-term  future.  They  provide  a means  of  defining  the  threat  that 
exists  in  future  timeframes.  Let  us  assume  that  an  estimate  is  composed 
of  three  components:  a range  of  forecasts  of  (1)  the  future  order  of 
battle,  (2)  the  characteristics  and  performance  of  the  system,  and  (3) 
textual  information  pertinent  to  estimating  deployment  and  usage. 

The  analysts’  information  is  never  constant.  Instead,  it  varies  as  new 
intelligence  is  made  available  to  them.  Information  such  as  a shift  in 
emphasis  from  interceptor  aircraft  to  missile  defense  would  change  an 
analyst’s  estimate  of  the  number  of  each  system  likely  to  be  present  in 
the  future.  This  information  varies  in  the  degree  of  confidence  an 
analyst  attributes  to  it.  New  weapon  systems,  especially  those  in 
developmental  or  prototype  stages,  present  particularly  difficult 
problems. The shifts from a paper system to a prototype test system and
then to a deployable system all represent stages in the confidence
an analyst feels about the estimate.

Of obvious importance to planners is the amount of information
there  appears  to  be  in  a given  estimate.  More  information  would  allow 
them  to  better  perform  their  tasks  of  prescribing  force  structures  and 
contingency plans to counter the threat. The more information, the
more definitively they can plan the shift in forces needed. Providing a
measure  of  such  a signal  is  a surprisingly  easy  task.  The  method  derives 
from  a well-developed  body  of  knowledge  called  information  theory.1 
Although  it  may  seem  somewhat  surprising  that  we  can  measure 
information,  we  will  attempt  to  show  that  it  not  only  can  be  done,  but 
can  handle  some  extremely  complicated  side  issues. 

When reduced to the basics, the information measure is the log of the
ratio of two probabilities: the probability assigned by analysts that
some phenomenon will occur given their specific knowledge of the
situation and the probability which would have been assigned assuming
that the estimate was made at random. The former is usually
information  specific  to  a given  case.  Thus,  if  you  are  trying  to  estimate 
the  number  of  foxbats  in  the  next  10  years,  you  might  have  received 
Soviet  planning  documents  which  laid  out  some  information  specific  to 
foxbats.  The  latter  probability  can  be  thought  of  as  actuarial  type 
information.  Let  us  say,  for  example,  that  you  had  data  on  the  overall 
trends  of  Soviet  interceptor  aircraft  but  no  detailed  information  on  any 
specific  interceptor.  Your  “random”  estimate  then  would  be  the  same 
for  each  of  the  types  of  Soviet  interceptors  and  they  would  reflect  the 
known  overall  distribution  of  that  class.  To  bring  the  example  a bit 
closer  to  home,  assume  that  you  were  trying  to  estimate  the  height  of 
an  unspecified  human  person  living  in  1977.  Your  random  estimate 
would  simply  be  the  average  height  of  all  humans  taken  from  whatever 
actuarial  tables  compute  such  numbers.  The  classes  can,  however,  be 
changed.  You  could,  for  example,  be  asked  to  estimate  the  height  of  an 
unspecified  adult  American  female.  Your  “random”  estimate  would  be 
different, but only because you would have to look on a different
portion of your table. The “specific” estimate in the numerator would
have  to  be  based  on  additional  information  specifically  pertinent  to  the 
individual  whose  height  is  being  forecast. 

In  the  basic,  discrete  case,  the  relationship  between  probabilities  and 
information  is  stated  by  the  following  equation: 

$$I = \log\left(\frac{p(y)}{p(x)}\right) \qquad (1)$$

where I is the information present in an estimate, p(y) is the probability
assigned to an event after some communication (y) has been
transmitted, received, or accumulated. p(x) is the probability assigned
to  an  event  (x)  if  we  had  chosen  at  random.  The  value  p(x)  is  the 
minimum  probability  of  event  (x)  because  we  can  always  do  as  well  as 
chance.  Stated  differently,  pure  chance  is  the  minimum  information 
available  to  an  analyst. 

We  find  that  it  is  convenient  to  state  equation  (1)  a little  differently. 
Because  we  are  dealing  with  the  log  of  the  ratio  of  the  probabilities,  we 
can  subtract  the  logs  to  get  the  following  equivalent  statement. 

I = log  (p(y))-log  (p(x))  (2) 

Dealing  directly  with  logs  of  probabilities  normally  is  confusing  because 
they  are  always  negative  numbers.  Therefore,  information  theorists 
have  invented  the  concept  of  “uncertainty”  which  they  call  H. 
Uncertainty  is  equal  to  the  negative  of  the  log  of  the  probability.  Thus 
uncertainty  is  a positive  number  which  gets  larger  as  probabilities  get 
smaller. In equation (2), Hy = -log (p(y)) and Hx = -log (p(x)). Hx is
called  the  maximum  uncertainty  because  it  is  the  uncertainty  associated 
with  choosing  at  random.  Hy  is  the  absolute  uncertainty  because  it  is 
the  uncertainty  associated  with  the  content  of  the  message  at  hand. 

The  information  in  the  message  then  is  equal  to 
I = Hx-Hy  (3) 

This  is  the  basic  working  equation  of  information  theory.  The 
information  in  a message  is  the  difference  between  the  uncertainty  of 
randomness  and  the  uncertainty  after  the  message  has  been  received.  In 
some  instances,  the  mathematics  associated  with  the  evaluation  of 
equation  (3)  can  become  very  tedious,  but  they  can  always  be  brought 
back to that basic equation.

The use of equations (1)-(3) can be demonstrated in the following
example. Assume that I have chosen a number between 1 and 10 and
ask you to guess what it is knowing only that each number is equally
probable. Under these conditions, you have no better than a 1 in 10
chance at guessing the number correctly. Thus, your information is
equal to:

$$I = \log\left(\frac{p(y)}{p(x)}\right) = \log\left(\frac{\text{probability after the message}}{\text{minimum probability}}\right) = \log\left(\frac{.10}{.10}\right) = \log 1 = 0$$

or I = Hx-Hy = 3... - 3... = 0.

Assume,  however,  that  a friend  whispers  in  your  ear  that  I do  not 
choose my numbers randomly. Rather, he tells you, I have a strong
preference for the number 6 and can be counted on to choose it
about  40  percent  of  the  time. 

Now your information is:

$$I = \log\left(\frac{\text{probability after the message}}{\text{minimum probability}}\right) = \log\left(\frac{.40}{.10}\right) = \log 4 = 2$$

or I = Hx-Hy = 3... - 1... = 2. (Assuming we are using log base 2.)
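
The two guessing cases can be checked numerically. The following Python sketch is an illustration only: the function names `uncertainty` and `information` are introduced here and do not come from the paper, and log base 2 is assumed, as in the example.

```python
import math

def uncertainty(p: float) -> float:
    """Uncertainty H = -log2(p) of an event assigned probability p."""
    return -math.log2(p)

def information(p_after: float, p_random: float) -> float:
    """Equation (3): I = Hx - Hy, which equals log2(p_after / p_random)."""
    return uncertainty(p_random) - uncertainty(p_after)

# Case 1: no message, so every number from 1 to 10 is equally likely.
no_message = information(p_after=0.10, p_random=0.10)

# Case 2: the friend's tip raises the probability of the number 6 to 0.40.
with_tip = information(p_after=0.40, p_random=0.10)

print(round(no_message, 6), round(with_tip, 6))
```

The first case yields no information and the second yields about 2 bits, in agreement with the hand calculation.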

In  most  of  our  cases  we  will  not  be  dealing  with  discrete  systems  like 
the  digits  between  1 and  10  but  with  continuous  functions  such  as 
speeds  and  weights  of  airplanes.  This  means  that  rather  than  dealing 
with  specific  probabilities,  the  equations  must  be  expressed  as 
continuous  probability  functions.  This  is  not  difficult  to  do,  but  we  will 
not  present  the  math  here.  It  is  sufficient  to  recognize  the  difference.2 

DEVELOPING  THE  MEASURES  OF  INFORMATION 

If  we  wanted  to  actually  estimate  the  uncertainty  in  the  forecasts  of 
weapons  systems  on  a practical  basis,  we  would  need  the  following  five 
measures. 

• maximum  uncertainty  (basic) 

• maximum  uncertainty  (modified) 

• weightings 

• absolute  uncertainty 

• corrections  for  quality  of  input  data 

For the purposes of clarification, however, let us first define a small
vocabulary to improve the chances of communicating what the
research orientation looks like.

• weapons category - a broad category of weapons which are
normally classed together, such as fighter aircraft or long-range bombers.

• weapon - a specific weapon within a weapons category, such as a
foxbat.

• weapons component - some part or characteristic of a weapon, such
as speed or weight.

• major document category - one of the three major sections of the
forecasting document (i.e., future order of battle, characteristics and
performance, textual discussion).

• document subcategory - one of the specific estimates within the
C&P or textual sections of the document (e.g., speed or mission).

Having provided a few basic definitions, we can proceed to the
methodology.

MAXIMUM  UNCERTAINTY  (BASIC) 

General. As we have said before, the basic foundation of information
theory is that the information content of any message is equal to the
reduction in the uncertainty from what we would have known had we
simply chosen at random. The complexity of that statement is unveiled
when we begin to ask ourselves what it means to choose at random.
What, for example, is the distribution of estimates which we
would obtain by randomly guessing the speed or altitude of a foxbat?
Clearly  the  “at  random”  aspect  of  any  estimate  is  bounded  by  a range 
of  previous  estimates  of  both  the  foxbat  and  other  planes  similarly 
configured. 

Information  theorists  have  come  to  some  unanimity  that 
randomness  is  a matter  of  perspective  of  the  viewer.  That  is, 
“randomness”  can  vary  considerably  depending  on  the  choice  of  the 
classification  system  used  by  the  analyst.  However,  once  we  are  able  to 
agree  on  a classification  system,  the  definition  of  randomness  becomes 
as  rigid  as  it  was  previously  ephemeral.  In  our  case,  the  classification 
system is obvious: the weapons categories of the forecast document. A
foxbat,  for  example,  is  a member  of  the  class  of  Soviet  interceptor 
aircraft.  Thus,  were  we  estimating  the  speed  of  foxbats  at  random,  it 
would  be  assumed  that  we  would  know  the  distribution  of  speeds  of 
Soviet  interceptor  aircraft.  If  we  are  attempting  to  measure  the 
information  content  in  a knowledgeable  estimate  of  the  speed  of  a 
foxbat,  we  simply  have  to  measure  the  improvement  of  our 
knowledge-based  estimate  over  our  estimate  had  we  known  only  that  it 
was  a Soviet  interceptor  aircraft.  Our  general  problem  in  measuring 
maximum  uncertainty  then  is  to  identify  the  probability  distributions 
of  all  of  the  document  categories  and  subcategories  for  each  of  the 
weapons  categories  under  consideration. 

Future  Order  of  Battle.  Since  we  are  forecasting  the  number  of 
weapons  of  a particular  type  10-20  years  in  the  future,  we  need  to 
know  the  diversity  of  the  larger  system  across  the  last  20  years.  Thus, 
for  the  case  of  fighter  aircraft,  we  would  want  to  know,  since  1957, 
how  many  of  each  type  of  fighter  aircraft  the  Soviets  have  had.  From 
this  we  can  calculate  a measure  of  dispersion  of  the  weapons  system 
across  the  most  recent  20  years.3  It  is  this  dispersion  which  provides 
the  historical  or  “random”  knowledge  base  for  calculating  the  ratio  of 
current  knowledge  to  historical  knowledge. 
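
A minimal sketch of this step, assuming the pooled historical counts are roughly Gaussian so that their dispersion can be converted to an uncertainty with the standard Gaussian entropy formula. The counts below are invented placeholders, not real order-of-battle data, and `gaussian_uncertainty` is a hypothetical helper. The paper's own dispersion measure (its note 3) may differ.

```python
import math
import statistics

def gaussian_uncertainty(sigma: float) -> float:
    """Differential entropy, in bits, of a Gaussian with standard deviation sigma."""
    return 0.5 * math.log2(2 * math.pi * math.e * sigma ** 2)

# Hypothetical inventory counts for the aircraft types in one weapons category,
# pooled over the historical window (placeholder numbers for illustration only).
historical_counts = [120, 340, 560, 210, 430, 650, 90, 300, 510, 720]

class_sigma = statistics.stdev(historical_counts)  # dispersion of the category
h_max = gaussian_uncertainty(class_sigma)          # maximum ("random") uncertainty Hx

print(f"dispersion = {class_sigma:.1f}, maximum uncertainty = {h_max:.2f} bits")
```

A wider historical spread gives a larger Hx, so a forecast of fixed width earns more credit in a category that has historically been more varied.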

Characteristics  and  Performance.  The  operations  here  are  identical  to 
those  described  above  except  that  they  must  be  done  for  each 
subcategory  within  the  C&P  section.  Thus,  for  fighter  aircraft,  we 
would  have  maximum  uncertainty  for  speed,  range,  etc.  for  all  of  the 
subcategories  which  we  anticipate  using. 

Textual. Since the categories in the textual section are likely to be
discrete rather than continuous, the calculation of maximum
uncertainty is likely to be simpler. It is highly likely that
we will be able to use random sampling methods to determine, for
example, the probability that a fighter aircraft will be primarily used
for close air support rather than dogfighting. Alternately, for those
questions  addressed  in  the  textual  section,  it  may  be  possible  to  use  the 
judgments  of  current  experts  on  the  weapons  category  under 
consideration. 


MAXIMUM  UNCERTAINTY  (MODIFIED) 

General.  Although  it  may  seem  somewhat  counterintuitive,  the 
maximum  uncertainty  of  a weapons  category  is  not  constant.  Aside 
from  the  fact  that  it  can  change  across  time,4  it  can  change  as  a 
function  of  the  relationship  between  the  variables  whose  values  are 
being  estimated.  Consider  the  following  example.  Let  us  say  that  your 
task  is  to  estimate  the  height,  weight  and  sex  of  a person  about  whom 
your  intelligence  sources  have  gathered  some  information.  Yet  before 
looking  at  your  information,  you  decide  to  calculate  your  basic 
measures  of  uncertainty.  Being  a reasonable  person,  you  take  out  your 
actuarial  tables  which  tell  you  that  the  probability  that  this  person  will 
be  male  is  .48,  and  that  the  means  and  standard  deviations  for  height 
and  weight  are  mh,  sh;  mw,  sw  respectively.  From  this  data  you 
proceed to calculate the uncertainties Hm, Hh, and Hw respectively.

Having done this, you open your packet of data and find that
your  data  sources  tell  you  that  the  person  is  definitely  a male  and  that 
there  is  a 75  percent  chance  that  he  stands  between  6’6”  and  6’8”  tall. 
You  ask  where  the  data  on  weight  is  and  are  told  that  it  was  determined 
that  you  did  not  have  a need  to  know.  “Very  well,”  you  mutter  to 
yourself  and  decide  that  knowing  that  this  person  is  probably  6’6”  to 
6’8”  and  male  significantly  narrows  down  the  possible  range  of  weight 
so  you  look  at  your  tables  and  conclude  that  there  is  a 75  percent 
chance  that  he  weighs  between  230  and  250  lbs. 

The question is, “How much information have you provided your
planner on weight?” The answer is “none.” You get credit for the
reduction in uncertainty on height and sex, but given that you knew
their values, you did no better than choose the weight at random. What
changed was the base from which your randomness was computed. By
the time you got around to estimating weight, you had a very easy
problem to solve and you added no information to what the planner
could have figured out for himself.

This  adjustment  is  rather  easy  to  make  in  theory  although  it  can  get 
a bit  messy  in  practice.  Theoretically,  all  that  needs  to  be  done  is 
establish  the  relationship  between  the  variables  and  readjust  the 
maximum  uncertainty  given  the  knowledge  of  the  other  variables.  In 
the  example,  you  would  simply  have  identified  the  probabilities  that  a 
person  would  have  “x”  weight  given  your  knowledge  of  height  and  sex. 
Once you have that distribution, you may either integrate it or assume
normality  and  calculate  its  standard  deviation. 

In our case, the problem is only slightly more tedious. For example,
maximum speed and weight might predict the number of fighter
aircraft deployed. Similarly, wing span, weight, and wing angle might
predict take-off speed. These relationships could be translated into
linear equations and fit against the same historical data used to
compute the basic maximum uncertainty. Each of the equations will have
a measurable fit with the data called R^2. The value of R^2 is equal to
the proportion of variation which can be accounted for by the equation.
From this we can compute an estimated uncertainty. If the estimated
uncertainty is less than the basic uncertainty, we should reduce the
basic uncertainty by an amount proportional to the R^2. Basically, this
process permits us
to  measure  one  of  the  sources  of  double  counting  of  the  same 
information  and  to  adjust  for  it  by  altering  the  value  of  the  maximum 
uncertainty.  Stated  differently,  this  adjustment  prohibits  the  analyst 
from  getting  credit  for  solving  particularly  easy  problems. 
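
One way to make the R^2 adjustment concrete is offered here as a sketch, not as the paper's exact formula. If the variable is treated as Gaussian and predicted by a linear equation, the unexplained variance is (1 - R^2) times the original variance, so the maximum uncertainty falls by -0.5 log2(1 - R^2) bits. The data values and the helper names `r_squared` and `modified_uncertainty` are hypothetical.

```python
import math

def r_squared(xs, ys):
    """R^2 of a one-predictor least-squares line (the squared correlation)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

def modified_uncertainty(h_basic, r2):
    """Lower the basic maximum uncertainty (bits) by what the equation explains.

    Assumes Gaussian residuals; r2 must be strictly below 1.
    """
    return h_basic + 0.5 * math.log2(1.0 - r2)

# Hypothetical historical pairs: aircraft weight (tons) and take-off speed (knots).
weights = [10.0, 12.0, 15.0, 18.0, 20.0, 25.0]
speeds = [130.0, 138.0, 150.0, 155.0, 170.0, 181.0]

r2 = r_squared(weights, speeds)
h_basic = 6.0  # placeholder basic maximum uncertainty for take-off speed, in bits
h_modified = modified_uncertainty(h_basic, r2)
print(f"R^2 = {r2:.3f}; maximum uncertainty {h_basic:.2f} -> {h_modified:.2f} bits")
```

The closer R^2 is to 1, the less credit remains for estimating a variable that the other estimates already pin down.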

Future  Order  of  Battle,  C&P,  Textual.  It  is  really  impossible  to 
discuss  these  categories  independently  because  the  essence  of  modified 
uncertainty  is  that  it  is  necessary  to  correct  for  the  effects  of 
interrelationships  between  categories  and  subcategories.  The  most 
difficult  practical  task  will  be  the  delineation  of  the  initial  set  of 
equations  to  be  tested.  Included  in  the  decision  to  test  certain 
equations  and  not  others  would  be  an  understanding  of  the  normal 
sequence  undertaken  by  the  analyst.  If,  for  example,  he  nearly  always 
writes  the  textual  section  last,  one  might  develop  equations  predicting 
mission  from  the  estimated  OB  and  certain  aspects  of  C&P.  If  the 
sequence  were  reversed,  the  structure  of  the  equations  would  be 
accordingly  altered.  If  both  sequences  are  likely,  the  computer  would 
have  to  scan  the  uncertainties  to  determine  the  most  probable  sequence 
for  that  case.  This  could  be  done  using  the  theory  of  causal  inference. 

WEIGHTINGS


General.  The  categories  within  a given  forecast  are  not  equally 
important.  Future  order  of  battle,  for  example,  may  be  more  or  less 
important  than  any  given  performance  characteristic,  but  it  is  unlikely 
to  be  equal  in  importance.  Because  of  this,  the  categories  must  be 
weighted.  The  weighting  of  the  major  categories  and  the  subcategories 
would  be  one  of  the  more  important  factors  in  measuring  the 
information  content.  Ultimately,  the  weighting  has  to  reflect  the 
thinking  of  the  planners  who  will  use  both  the  document  and  the 
information  measure.  There  is  much  that  could  be  done  to  attempt  to 
make  the  weightings  as  sound  as  possible.  One  option  would  require  a 
set  of  short  open-ended  discussions  with  users  of  the  forecast,  writers  of 
the  forecast,  users  and  developers  of  wargames  and  simulations,  and 
others  who  could  be  expected  to  be  knowledgeable  about  the  subject 
being  forecast.  From  them  one  might  hope  to  gather  three  types  of 
information:  (1)  the  weights  which  they  would  attach  to  the  document 
categories;  (2)  the  criteria  they  used  to  identify  these  weights;  and  (3) 
any  contingencies  which  they  felt  appropriate.  Some  of  the 
interviewees  might,  for  example,  believe  that  the  dollar  impact  on  the 
US  planners  must  be  the  primary  weighting  criterion  while  others  might 
believe  that  the  impact  on  Soviet  fighting  capability  should  be  most 
important. One might also find that some of the experts would posit a
number  of  contingencies  such  as  the  argument  that  the  estimated  type 
of  mission  could  well  influence  the  weightings  for  different  weapons 
within  a total  weapons  category,  while  others  might  argue  that 
variations  at  the  high  and  low  ends  of  the  range  of  uncertainty  are  not 
as  important  as  are  those  variations  around  the  middle  ranges  (i.e.,  the 
weightings  are  not  linear). 

From  these  suggestions,  the  most  dominant  patterns  of  weighting 
systems  should  be  programed  and  the  results  examined  for  face  validity. 
Ultimately, of course, the choice of weighting scheme will be
determined by the subjective judgments of the forecasters and planners
who develop and use the forecasts.

ABSOLUTE  UNCERTAINTY 

General.  Absolute  uncertainty  is  that  measure  of  the  uncertainty  of 
the  specific  estimate  being  considered.  Of  all  of  our  measures,  it  is 
certainly the easiest to compute or develop. If it is to be computed, the
range of the values estimated by the forecaster, taken in conjunction
with  the  relative  location  of  the  “best”  estimate  can  be  combined  with 
the  assumption  of  a skewed  Gaussian  distribution  to  permit  the 
calculation  of  the  absolute  uncertainty  for  the  future  OB  section  and 
for  each  of  the  subcategories  of  the  C&P  section.  For  the  discrete  case 
in  the  textual  section,  the  logs  of  the  probabilities  provided  by  the 
analyst  are  all  that  are  needed. 

Alternatively,  absolute  uncertainties  can  be  subjectively  estimated 
by  the  analysts.  They  simply  have  to  provide  a number  for  each  of  the 
major  categories  (within  the  limits  set  by  the  weightings,  of  course). 

Future  Order  of  Battle.  Whether  developed  subjectively  or 
numerically,  there  is  but  one  number  defining  the  absolute  uncertainty 
of  the  future  order  of  battle.  Subjectively,  it  would  be  a number 
assigned  by  the  analysts  having  been  told  what  the  weightings  are  and 
having  been  provided  with  a calibration  instrument  so  that  they  know 
what  scores  should  be  attached  to  varying  levels  of  information 
content. 

Numerically, the absolute uncertainty is approximately equal to the
standard deviation of the range of estimates for a given weapon.
Specifically, it is the negative of the integral of the log of the
probability function times the probability function. That is,

$$H_y = -\int p(y) \log(p(y))\, dy\,^{5}$$

where Hy is the absolute uncertainty and p(y) is the probability
density function implied by the range of the estimates.
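
The numerical route can be sketched as follows, under simplifying assumptions that are not the paper's: the forecast range is treated as the central interval of a symmetric Gaussian (the skew implied by the location of the best estimate is ignored), and the integral is evaluated with the closed-form Gaussian entropy. The function names and the 400-to-600 forecast are hypothetical.

```python
import math
from statistics import NormalDist

def sigma_from_interval(low, high, coverage):
    """Standard deviation of a symmetric Gaussian whose central `coverage`
    probability mass spans the interval [low, high]."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)  # about 1.15 for 75 percent
    return (high - low) / (2.0 * z)

def gaussian_uncertainty(sigma):
    """H = -integral of p*log2(p) for a Gaussian: 0.5*log2(2*pi*e*sigma^2) bits."""
    return 0.5 * math.log2(2 * math.pi * math.e * sigma ** 2)

# Hypothetical forecast: 75 percent chance the inventory lies between 400 and 600.
sigma_y = sigma_from_interval(400, 600, 0.75)
h_y = gaussian_uncertainty(sigma_y)
print(f"implied sigma = {sigma_y:.1f}, absolute uncertainty = {h_y:.2f} bits")
```

If the maximum uncertainty Hx is computed with the same Gaussian formula from a historical standard deviation sigma_x, then I = Hx - Hy reduces to log2(sigma_x / sigma_y), so halving the width of the forecast range adds one bit of information.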

Characteristics  and  Performance.  The  calculation  of  absolute 
uncertainty  for  this  section  is  done  somewhat  differently  depending  on 
whether  it  is  done  arithmetically  or  subjectively.  When  it  is  done 
subjectively,  a single  estimate  for  the  entire  category  is  supplied  by  the 
analyst.  For  this  instance  particularly,  the  subjective  estimate  should  be 
capable  of  capturing  some  of  the  uniqueness  involved  in  estimating 
various  elements  of  this  particular  weapon  in  this  particular  time 
period. 

When  numerical  methods  are  used,  the  range  of  estimates  and  the 
best  estimates  (if  they  are  provided)  would  be  used  as  in  the  future  OB 
case  to  provide  the  definition  of  the  probability  density  functions. 
However,  in  the  C&P  case,  these  functions  would  be  developed  (by  the 
computer)  for  each  of  the  subcategories.  It  is  then  a trivial  task  to 
compute  the  integrals  to  provide  the  estimate  of  the  absolute 
uncertainty  for  each  of  the  subcategories.  These  would  then  be 
combined into a total for the category using the method to be
described  below  in  the  section  “Putting  It  All  Together.” 

Textual.  The  absolute  uncertainty  for  the  textual  section  would  be 
very  easy.  Subjectively  it  would  be  computed  just  like  the  C&P  section. 
Numerically,  information  theory  permits  the  direct  processing  of  the 
probability  of  discrete  events.  There  is  no  need  for  an  intermediary  step 
of  computing  a probability  density  function.  The  uncertainty  of  any 
subcategory (say mission) is simply the negative of the log of the probability as
assigned  by  the  analyst.  As  in  the  case  of  the  C&P  section,  the 
subcategories  can  be  combined  as  we  will  show  later  in  this  proposal. 

CORRECTIONS  FOR  THE  QUALITY  OF  INPUT  DATA 

General.  One  of  the  problems  in  developing  forecasts  such  as  we  are 
considering  is  that  there  is  no  rigorous  method  of  controlling  for  the 
variations  in  the  quality  of  the  historical  data  upon  which  the  forecasts 
are based. In some instances, forecasts are based on a solid,
well-documented historical tracking of performance and deployment.
In other cases, those data are of inconsistent quantity and quality. For
new systems, there may be no more than sketchy documentation of
limited aspects of some weapon which exists only on Soviet drafting
boards. 

Because  it  is  difficult  to  include  the  data  errors  in  the  forecast,  one 
option  which  is  occasionally  used  is  to  make  the  forecasts  as  though  the 
data  were  perfect  and  then  to  provide  an  additional  estimate  of  the  data 
accuracy.  Subsequently,  the  planner  uses  subjective  judgment  to 
evaluate  the  forecast.  This  approach  has  been  less  than  satisfactory. 

An  alternative  is  to  compute  the  information  measure  including  data 
error.  We  know  from  classical  statistics  that  because  of  randomizing 
effects,  errors  in  summary  measures  (such  as  these  forecasts)  decrease 
predictably as the simple quantity of data increases. Specifically, the
error shrinks in inverse proportion to the square root of the number of
data points minus one.

This  fact  enables  us  to  adjust  for  the  value  of  the  probability 
function  in  the  computation  of  the  absolute  uncertainty.  It  is  easy  to 
show that we need only add the term log (p(1-e)) to equation (3). The
term  “e”  refers  to  the  estimated  error  in  the  forecast  due  to  the  data 
error. 

So  far  we  have  assumed  that  the  error  in  the  data  and  the  range  of 
estimates  are  unrelated.  In  fact  this  is  unlikely  to  be  true.  Despite 
instructions to the contrary, analysts realizing the weakness of their
data  probably  hedge  their  estimates  more  than  they  might  have  done 
had  they  thought  the  data  were  better.  To  correct  for  this,  we  can 
measure  historically  the  strength  of  the  relationship  between  the 
quality  of  the  data  and  the  range  of  estimates.  The  strength  of  that 
relationship is the R^2 of a regression equation and tells us the amount
of  overlap  we  can  normally  expect  between  the  two  variables.  Thus  we 
would  want  to  subtract  the  overlap  from  the  estimate  of  information. 
This  can  be  done  very  simply. 

\[ I = \log\bigl(p(1-e)\bigr) - R^2 \log\bigl(p(1-e)\bigr) + H_x - H_y \tag{4} \]

\[ I = (1 - R^2)\,\log\bigl(p(1-e)\bigr) + H_x - H_y \]
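
Equation (4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, p is treated as a probability in (0, 1], e as a fractional error in [0, 1), and base-2 logarithms are used because the text does not fix a base. The precise meaning of p comes from equation (3), which is not restated here.

```python
import math


def corrected_information(p, e, r_squared, h_x, h_y, log=math.log2):
    """Equation (4): I = (1 - R^2) * log(p * (1 - e)) + Hx - Hy.

    Assumptions (not from the source): p in (0, 1], e in [0, 1),
    R^2 in [0, 1], and base-2 logs by default.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p is assumed to be a probability in (0, 1]")
    if not 0.0 <= e < 1.0:
        raise ValueError("e is assumed to be a fraction in [0, 1)")
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("R^2 must lie in [0, 1]")
    # Only the part of the data error the analyst has NOT already hedged
    # for (the share 1 - R^2) is charged against the information.
    correction = (1.0 - r_squared) * log(p * (1.0 - e))
    return correction + h_x - h_y


# Hypothetical inputs: Hx = 5, Hy = 2, data error of one half, no overlap.
example = corrected_information(p=1.0, e=0.5, r_squared=0.0, h_x=5.0, h_y=2.0)
```

When R² equals one, the analyst's hedging is taken to cover the data error entirely and the correction vanishes, leaving Hx - Hy.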

While adjusting the information content for varying levels of
quality of input data is not a trivial exercise, perhaps the most
important role of this correction mechanism is that it lets us
alter the information content for systems which have yet to be put into
production.

Although  we  do  have  information  on  these  weapons,  our  problem  is 
that  we  do  not  know  at  present  how  to  evaluate  it  relative  to  standard 
time  series  of  conventional  historical  data.  To  address  this  question,  it 
would  be  necessary  to  establish  subjectively  some  rules  of  thumb  about 
the  quality  equivalence  of  the  type  of  intelligence  data  for  a developing 
weapon  and  “hard”  historical  facts  about  existing  and  operating 
weapons  through  discussions  with  analysts  and  others  knowledgeable 
about  the  subject  being  forecast.  These  rules  of  thumb  could,  for 
example,  be  a function  of  the  number  of  years  that  a particular  weapon 
has  been  in  developmental  stages.  Thus  the  first  year  that  some  new 
weapon  is  reported,  the  information  content  may  be  severely 
downgraded.  In  subsequent  years  the  adjustment  may  move  slightly 
upwards. Thus, as a weapon moves from planning through development
to deployment, the error estimates will slowly decrease. After the initial
deployments, the more rigorous methods of accounting for error would
be employed.
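
One way such a rule of thumb might look is sketched below. Every number and name in it is a placeholder invented for illustration. The actual values would have to come from the discussions with analysts described above, and the linear decline is only one of many possible shapes.

```python
def development_error(years_reported, first_year_error=0.6,
                      yearly_reduction=0.05, floor=0.1):
    """Hypothetical error estimate e for a weapon still in development.

    All defaults are illustrative placeholders, not values from the source.
    The error starts high in the first year a weapon is reported and
    declines slowly each year until it reaches a floor.
    """
    if years_reported < 1:
        raise ValueError("years_reported counts from 1 (first year reported)")
    error = first_year_error - yearly_reduction * (years_reported - 1)
    return max(floor, error)


# Error estimates for the first five years of reporting.
schedule = [development_error(year) for year in range(1, 6)]
```

The resulting value would feed the term e in the data-error correction, so that a newly reported system is downgraded most heavily and the penalty eases as evidence accumulates.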


PUTTING  IT  ALL  TOGETHER 

We  have  presented  a number  of  concepts  feeding  into  information 
theory  in  one  manner  or  another.  Now,  we  will  attempt  to  tie  them  all 
together  into  a fairly  short  coherent  package.  We  start  with  the 
assumption  that  we  have  three  sections  for  which  there  are  forecasts: 
future  order  of  battle,  characteristics  and  performance,  and  textual 
description. The latter two contain several identifiable subsections for
which there are forecasts associated with probabilities and/or ranges
and best estimates. Future order of battle forecasts only one
phenomenon, which provides us with a range and a best estimate.

We have suggested a method which would compute information
content measures for each of these elements, weight them, make
corrections for interrelationships between the variables, correct for
faulty input data, and add them together. The final information measure
could  range  from  zero  to  100  where  zero  is  no  information  and  100  is 
certainty.  Let  us  quickly  review  the  method  for  accomplishing  this  task. 
We know that for any single estimate, the information is the log of the
ratio of the probability of the estimate to the probability had we
chosen at random:

\[ I = \log \frac{p(y)}{p(x)} \]

In this equation p(y) is roughly equivalent to the range of any given
forecast. The p(x) is roughly equivalent to the range if the analyst had
known nothing about the weapon except the category within which it
fits. It is convenient to define variables Hy and Hx which are equal to
minus  the  logs  of  the  numerator  and  denominator  respectively.  They 
are  formally  called  the  absolute  uncertainty  and  the  maximum 
uncertainty.  Therefore,  the  information  for  a given  estimate  such  as 
speed  or  mission  is  simply  the  difference  between  them. 

\[ I = H_x - H_y \]
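
A small numerical sketch may help fix the idea. It rests on an assumption that goes beyond the text: each probability is taken to be a uniform density over the corresponding range, so that p = 1/width and the uncertainty -log p equals the log of the width. The function names, the base-2 logs, and the speed figures are all hypothetical.

```python
import math


def uncertainty(width, log=math.log2):
    """-log p for an assumed uniform density p = 1/width over a range."""
    if width <= 0:
        raise ValueError("range width must be positive")
    return log(width)  # -log(1 / width) equals log(width)


def information(forecast_width, category_width, log=math.log2):
    """I = Hx - Hy, with Hy from the forecast range and Hx from the
    range implied by the weapon category alone."""
    h_y = uncertainty(forecast_width, log)   # absolute uncertainty
    h_x = uncertainty(category_width, log)   # maximum uncertainty
    return h_x - h_y


# Hypothetical speeds: the category spans 800 units, the forecast 100.
bits = information(forecast_width=100.0, category_width=800.0)
```

Under these assumptions the forecast narrows the range by a factor of eight, which is worth about three bits. A forecast as wide as the whole category carries no information.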

We  found,  however,  that  the  maximum  uncertainty  is  not  a constant.  It 
is  a function,  in  part,  of  the  relationships  between  the  forecast  under 
immediate  investigation  (say  speed  of  a foxbat)  and  the  other  variables 
in  the  estimates  (say  weight  and  armament).  Therefore  the  value  of  Hx 
has  to  be  changed  by  a two-step  process  which  leaves  us  with  the 
following  slight  modification: 

\[ I = \hat{H}_x - H_y \]

where \hat{H}_x indicates that we are dealing with the second estimate of Hx.

Even  the  single  estimate  is  still  not  complete  for  we  have  not  yet 
corrected  for  the  variations  in  the  quality  of  the  input  data.  This  is  a 
straightforward  calculation  which  results  in  the  addition  of  a small 
negative  number  to  the  information  content.  That  revision  creates  the 
following  modification: 

\[ I = (1 - R^2)\,\log\bigl(p(1-e)\bigr) + \hat{H}_x - H_y \]

where R² is a measure of the amount of data error the analyst has
already included in his estimate and e is the measure of how much error
the data inaccuracies will introduce into the forecasts. This correction is
particularly  important  because  it  is  the  mechanism  by  which  the 
information  measures  of  weapons  systems  under  development  will  be 
adjusted  downward. 

To  this  point,  we  have  presented  a mechanism  by  which  the 
information  content  of  any  specific  estimate  can  be  measured.  The 
measures  are  all  generated  on  identical  scales  and  are  therefore  additive. 
We  have  not  yet  accounted  for  the  fact  that  the  various  parts  of  the 
total  estimate  are  of  obviously  different  importance.  Estimates  of  the 
mission  of  a fighter  aircraft  are  not  exactly  as  important  as  the 
estimates  of  its  speed  or  of  the  number  of  aircraft  which  are  expected 
to  be  placed  in  service  20  years  from  now. 

To  account  for  this  problem,  it  would  be  necessary  to  develop  a 
weighting  system  for  each  of  the  specific  estimates.  This  would  almost 
certainly  be  a two-stage  process  for  which  future  order  of  battle 
estimates  might  hypothetically  be  given  a maximum  of  50  points, 
characteristics  and  performance  30  points  and  the  textual  section  20 
points.  Within  the  latter  two  sections,  the  subcategories  could  be 
assigned  points  assuring  that  the  total  could  not  exceed  the  maximum 
for  the  overall  category. 
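
The two-stage point scheme can be checked mechanically. The sketch below uses the hypothetical category maxima given above (50, 30, and 20). The subcategory names and point values are invented for illustration, as is the function name.

```python
# Category maxima taken from the hypothetical split in the text.
CATEGORY_MAX = {"future_ob": 50, "c_and_p": 30, "text": 20}


def check_weights(subweights, category_max=CATEGORY_MAX):
    """Return the point total per category, raising if any category
    exceeds its maximum."""
    totals = {}
    for category, points in subweights.items():
        if category not in category_max:
            raise KeyError(f"unknown category: {category}")
        total = sum(points.values())
        if total > category_max[category]:
            raise ValueError(
                f"{category} totals {total}, above its maximum of "
                f"{category_max[category]}"
            )
        totals[category] = total
    return totals


# Illustrative subcategory points (not from the source).
example_weights = {
    "future_ob": {"number_deployed": 50},
    "c_and_p": {"speed": 18, "weight": 12},
    "text": {"mission": 12, "location": 8},
}
totals = check_weights(example_weights)
```

A check of this kind guards against a subcategory being added later without trimming its neighbors.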

These weightings would then multiply the entire information
measure for any given estimate. Thus if there were a weight W1 for
some specific estimate (e.g., future order of battle), the information for
future OB would be

\[ I_1 = W_1 \bigl[ (1 - R_1^2)\,\log\bigl(p(1 - e_1)\bigr) + \hat{H}_{x_1} - H_{y_1} \bigr] \]

Although  this  equation  applies  to  the  estimate  of  only  one  specific 
component  of  a specific  type  of  weapon,  information  theory  has  the 
advantage  that  the  units  of  “I”  are  always  comparable  assuming  the 
proper  factors  have  been  taken  into  consideration.  Thus,  if  we  had  a 
very  simple  weapon  (e.g.,  some  type  of  fighter  aircraft)  with  two 
components  of  C&P  (e.g.,  weight  and  speed)  and  two  components  of 
text  (e.g.,  mission  and  location),  we  would  have  a measure  of  the 
information  for  each  of  these  four  elements  and  for  the  estimate  of  the 
future  order  of  battle.  Therefore  we  could  array  the  information  as 
shown  below: 

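\[
\begin{aligned}
I_{\text{future OB}} &= W_1\, I_{\text{future OB}} \\
I_{\text{C\&P}} &= W_{21}\, I_{\text{speed}} + W_{22}\, I_{\text{weight}} \\
I_{\text{text}} &= W_{31}\, I_{\text{mission}} + W_{32}\, I_{\text{location}} \\
I_{\text{total}} &= I_{\text{future OB}} + I_{\text{C\&P}} + I_{\text{text}}
\end{aligned}
\]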
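
The weighted sum for the simple fighter aircraft example can be sketched as follows. The function name, the weights, and the information values are all hypothetical. In particular, each unweighted information value is assumed to be scaled to lie between zero and one, so that with weights summing to 100 the total falls between zero and 100.

```python
def total_information(weights, info):
    """Weight each component's information and sum by category.

    weights and info are nested dicts keyed by category, then component.
    Returns (per-category subtotals, overall total).
    """
    subtotals = {
        category: sum(w * info[category][name] for name, w in parts.items())
        for category, parts in weights.items()
    }
    return subtotals, sum(subtotals.values())


# Illustrative weights (W1, W21, W22, W31, W32) and information values.
weights = {
    "future_ob": {"future_ob": 50},
    "c_and_p": {"speed": 18, "weight": 12},
    "text": {"mission": 12, "location": 8},
}
info = {
    "future_ob": {"future_ob": 0.5},
    "c_and_p": {"speed": 0.25, "weight": 0.5},
    "text": {"mission": 0.75, "location": 0.5},
}
subtotals, total = total_information(weights, info)
```

With these made-up inputs the three category subtotals are 25, 10.5, and 13, for a total of 48.5 on the zero to 100 scale.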


The system envisioned in this paper could operate in such a manner
that an analyst could employ a computer work space advantageously in
his task. The computer software for such a system could operate with
virtually no human intervention if the analyst or planner wanted. The
analyst would be storing his estimates and a subjective estimate of
absolute uncertainty which he can update at any time. We believe that
he should be using a computer work space, updating it with new
information, so that the estimate can go final at any point in time
because it always contains up-to-date information. There should also be
the opportunity for supervisors to intervene and override the system
defaults at a number of points. For example, the supervisor should have
the option of
suppressing  the  computation  of  the  selected  interrelationships  between 
subcategories  in  the  event  that,  for  one  reason  or  another,  they  do  not 
apply  to  the  weapons  subsystem  under  consideration.  If,  for  example, 
certain  VTOLs  were  listed  in  the  category  of  fighter  aircraft,  many  of 
the  interrelationships  based  on  fixed  wing  aircraft  would  be  totally 
irrelevant. 

Finally,  when  operating  under  default,  the  system  would  compute 
the  absolute  uncertainty  based  on  the  spread,  best  estimates,  and  the 
appropriate  skewed  Gaussian  probability  distribution.  We  suggest  that  a 
better  system  would  provide  the  user  with  the  option  of  selecting 
among  a small  number  of  alternate  distributions.  While  we  expect  that 
this  option  would  be  rarely  used,  it  could  be  an  important  exercise  in 
the  event  that  there  were  serious  disagreements  about  some  particular 
estimate  and  the  associated  information  content. 
ENDNOTES


1. See Shannon and Weaver (1949), Goldman (1968), Theil (1967), and
Garner (1962) for examples of similar applications and development of the
theory.

2. See Goldman (1968) and Watanabe (1969) for developments.

3.  The  actual  number  of  years  (20)  in  this  example  is  not  intended  to  be  the 
number  needed  in  real  estimates.  It  may  vary  from  weapon  system  to  system. 

4.  Because  average  speeds,  weight,  numbers  deployed,  etc.,  change  across  time, 
the  averages  from  which  maximum  uncertainty  are  computed  will  be  expected  to 
change. 

5. For a Gaussian distribution, this is the standard deviation.